128 research outputs found

    HAMAP in 2015: updates to the protein family classification and annotation system.

    HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.

    Forkhead Transcription Factors (FoxOs) Promote Apoptosis of Insulin-Resistant Macrophages During Cholesterol-Induced Endoplasmic Reticulum Stress

    OBJECTIVE: Endoplasmic reticulum stress increases macrophage apoptosis, contributing to the complications of atherosclerosis. Insulin-resistant macrophages are more susceptible to endoplasmic reticulum stress–associated apoptosis, probably contributing to macrophage death and necrotic core formation in atherosclerotic plaques in type 2 diabetes. However, the molecular mechanisms of increased apoptosis in insulin-resistant macrophages remain unclear.

    A classification-based framework for predicting and analyzing gene regulatory response

    BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem: predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the Adaboost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a shallower and more interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from
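
The boosting step described in this abstract can be illustrated with a minimal sketch. It is not the GeneClass or Robust GeneClass implementation: it is plain confidence-rated AdaBoost over abstaining rules, with no alternating decision tree and no stabilization over correlated features. The feature encoding (each column is 1 when a hypothetical motif is present and a hypothetical regulator is active, else 0), the function names, and the toy data are all assumptions made for illustration.

```python
import math


def train_abstaining_adaboost(X, y, rounds=4, eps=1e-6):
    """Boost abstaining rules. X: rows of 0/1 features, y: +1/-1 labels.

    Rule j fires only on rows where feature j is 1 and abstains (outputs 0)
    elsewhere. The sign of alpha decides whether a firing rule votes for
    up-regulation (+1) or down-regulation (-1).
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n          # example weights, kept normalized
    model = []                 # list of (feature index, alpha)
    for _ in range(rounds):
        best = None
        for j in range(d):
            w_pos = sum(wi for wi, row, yi in zip(w, X, y)
                        if row[j] == 1 and yi == 1)
            w_neg = sum(wi for wi, row, yi in zip(w, X, y)
                        if row[j] == 1 and yi == -1)
            w_abs = max(0.0, 1.0 - w_pos - w_neg)   # weight where the rule abstains
            # Normalizer of the exponential loss for an abstaining rule;
            # a smaller value means a more useful rule.
            z = w_abs + 2.0 * math.sqrt(w_pos * w_neg)
            if best is None or z < best[0]:
                best = (z, j, w_pos, w_neg)
        _, j, w_pos, w_neg = best
        # eps smooths the ratio so alpha stays finite for a perfect rule.
        alpha = 0.5 * math.log((w_pos + eps) / (w_neg + eps))
        model.append((j, alpha))
        # Reweight: rows the rule abstains on (row[j] == 0) keep their weight.
        w = [wi * math.exp(-alpha * yi * row[j]) for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sum the votes of all firing rules; a zero score defaults to +1."""
    score = sum(alpha * row[j] for j, alpha in model)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): column 0 marks an activating motif/regulator pair,
# column 1 a repressing pair.
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
y = [1, 1, -1, -1]
model = train_abstaining_adaboost(X, y)
predictions = [predict(model, row) for row in X]
```

Because a rule that abstains leaves the weights of the rows it does not cover unchanged, each round concentrates on the examples that the firing rules still get wrong, which is one reason abstaining rules tend to give smaller models.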

    The IntAct molecular interaction database in 2012

    IntAct is an open-source, open data molecular interaction database populated by data curated either from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. As of September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications. The IntAct website has been improved to enhance the search process and in particular the graphical display of the results. New data download formats are also available, which will facilitate the inclusion of IntAct's data in the Semantic Web. IntAct is an active contributor to the IMEx consortium (http://www.imexconsortium.org). IntAct source code and data are freely available at http://www.ebi.ac.uk/intact

    Natural Polymorphism in BUL2 Links Cellular Amino Acid Availability with Chronological Aging and Telomere Maintenance in Yeast

    Aging and longevity are considered to be highly complex genetic traits. In order to gain insight into aging as a polygenic trait, we employed an outbred Saccharomyces cerevisiae model, generated by crossing a vineyard strain RM11 and a laboratory strain S288c, to identify quantitative trait loci that control chronological lifespan. Among the major loci that regulate chronological lifespan in this cross, one genetic linkage was found to be congruent with a previously mapped locus that controls telomere length variation. We found that a single nucleotide polymorphism in BUL2, encoding a component of a ubiquitin ligase complex involved in trafficking of amino acid permeases, controls chronological lifespan and telomere length as well as amino acid uptake. Cellular amino acid availability changes conferred by the BUL2 polymorphism alter telomere length by modulating activity of a transcription factor Gln3. Among the GLN3 transcriptional targets relevant to this phenotype, we identified Wtm1, whose upregulation promotes nuclear retention of ribonucleotide reductase (RNR) components and inhibits the assembly of the RNR enzyme complex during S-phase. Inhibition of RNR is one of the mechanisms by which Gln3 modulates telomere length. Identification of a polymorphism in BUL2 in this outbred yeast population revealed a link among cellular amino acid availability, chronological lifespan, and telomere length control.

    Collaborative annotation of genes and proteins between UniProtKB/Swiss-Prot and dictyBase

    UniProtKB/Swiss-Prot, a curated protein database, and dictyBase, the Model Organism Database for Dictyostelium discoideum, have established a collaboration to improve data sharing. One of the major steps in this effort was the ‘Dicty annotation marathon’, a week-long exercise with 30 annotators aimed at achieving a major increase in the number of D. discoideum proteins represented in UniProtKB/Swiss-Prot. The marathon led to the annotation of over 1000 D. discoideum proteins in UniProtKB/Swiss-Prot. Concomitantly, there were a large number of updates in dictyBase concerning gene symbols, protein names and gene models. This exercise demonstrates how UniProtKB/Swiss-Prot can work in very close cooperation with model organism databases and how the annotation of proteins can be accelerated through such collaborations.

    The UniProt-GO Annotation database in 2011

    The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360 000 taxa, this resource has increased 2-fold over the last 2 years. It has benefited from a wealth of checks to improve annotation correctness and consistency, and it now supplies greater information content, enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.

    Gis1 and Rph1 Regulate Glycerol and Acetate Metabolism in Glucose Depleted Yeast Cells

    Aging in organisms as diverse as yeast, nematodes, and mammals is delayed by caloric restriction, an effect mediated by the nutrient-sensing TOR, RAS/cAMP, and AKT/Sch9 pathways. The transcription factor Gis1 functions downstream of these pathways in extending the lifespan of nutrient-restricted yeast cells, but the mechanisms involved are still poorly understood. We have used gene expression microarrays to study the targets of Gis1 and the related protein Rph1 in different growth phases. Our results show that Gis1 and Rph1 act both as repressors and activators, on overlapping sets of genes as well as on distinct targets. Interestingly, both the activities and the target specificities of Gis1 and Rph1 depend on the growth phase. Thus, both proteins are associated with repression during exponential growth, targeting genes with STRE or PDS motifs in their promoters. After the diauxic shift, both become involved in activation, with Gis1 acting primarily on genes with PDS motifs, and Rph1 on genes with STRE motifs. Significantly, Gis1 and Rph1 control a number of genes involved in acetate and glycerol formation, metabolites that have been implicated in aging. Furthermore, several genes involved in acetyl-CoA metabolism are downregulated by Gis1.

    Environmental and Genetic Determinants of Colony Morphology in Yeast

    Nutrient stresses trigger a variety of developmental switches in the budding yeast Saccharomyces cerevisiae. One of the least understood of such responses is the development of complex colony morphology, characterized by intricate, organized, and strain-specific patterns of colony growth and architecture. The genetic bases of this phenotype and the key environmental signals involved in its induction have heretofore remained poorly understood. By surveying multiple strain backgrounds and a large number of growth conditions, we show that limitation for fermentable carbon sources coupled with a rich nitrogen source is the primary trigger for the colony morphology response in budding yeast. Using knockout mutants and transposon-mediated mutagenesis, we demonstrate that two key signaling networks regulating this response are the filamentous growth MAP kinase cascade and the Ras-cAMP-PKA pathway. We further show synergistic epistasis between Rim15, a kinase involved in integration of nutrient signals, and other genes in these pathways. Ploidy, mating-type, and genotype-by-environment interactions also appear to play a role in controlling colony morphology. Our study highlights the high degree of network reuse in this model eukaryote; yeast use the same core signaling pathways in multiple contexts to integrate information about environmental and physiological states and generate diverse developmental outputs.

    Environmental conditions shape the nature of a minimal bacterial genome

    Of the 473 genes in the genome of the bacterium with the smallest genome generated to date, 149 genes have unknown function, emphasising a universal problem: less than 1% of proteins have experimentally determined annotations. Here, we combine the results from state-of-the-art in silico methods for functional annotation and assign functions to 66 of the 149 proteins. Proteins that are still not annotated lack orthologues, lack protein domains, and/or are membrane proteins. Twenty-four likely transporter proteins are identified, indicating the importance of nutrient uptake into and waste disposal out of the minimal bacterial cell in a nutrient-rich environment after removal of metabolic enzymes. Hence, the environment shapes the nature of a minimal genome. Our findings also show that the combination of multiple different state-of-the-art in silico methods for annotating proteins is able to predict functions, even for difficult-to-characterise proteins, and to identify crucial gaps for further development.